Intel MIC

Intel Many Integrated Core Architecture (MIC)
Designer: Intel
Design: manycore extended x86/x64 design
Registers
General purpose: Intel Architecture registers
Floating point: 512-bit SIMD vector registers

Intel Many Integrated Core Architecture, or Intel MIC (pronounced "Mike"), is a multiprocessor computer architecture developed by Intel. It incorporates earlier work on the Larrabee multicore architecture, the Teraflops Research Chip multicore research project and the Intel Single-chip Cloud Computer multicore microprocessor.

Prototype products, codenamed Knights Ferry, were announced and released in 2010 to developers including CERN, the Korea Institute of Science and Technology Information (KISTI) and the Leibniz Supercomputing Centre. Hardware vendors for prototype boards included IBM, SGI, HP, Dell and others.^[1]

A commercial release, codenamed Knights Corner and to be built on a 22 nm process, is proposed for late 2012 to 2013. In September 2011 it was announced that the Texas Advanced Computing Center (TACC) will use Knights Corner cards in its 10 PetaFLOPS "Stampede" supercomputer, providing 8 PetaFLOPS of the compute power.^[2]

1 History
2 Knights Landing
3 Competition
4 Design
5 See also
6 References

History

Background

The Larrabee microarchitecture (in development since 2006^[3]) introduced very wide (512-bit) SIMD units to an x86-based processor design, extended to a cache-coherent multiprocessor system connected via a ring bus to memory; each core was capable of 4-way multi-threading. Because the design was intended for GPU as well as general purpose computing, the Larrabee chips also included specialised hardware for texture sampling.^[4] The project to produce a retail GPU product directly from the Larrabee research project was terminated in May 2010.^[5]

Another contemporary Intel research project implementing the x86 architecture on a many-core processor was the Single-chip Cloud Computer (prototype introduced 2009^[6]), a design mimicking a cloud computing datacentre on a single chip with multiple independent cores. The prototype included 48 cores per chip, with hardware support for selective frequency and voltage control of cores to maximise energy efficiency, and incorporated a mesh network for interchip messaging. The design lacked cache-coherent cores and focussed on principles that would allow it to scale to many more cores.^[7]

The Teraflops Research Chip (prototype unveiled 2007^[8]) was an experimental 80-core chip with two floating point units per core, implementing a 96-bit VLIW architecture. The project investigated intercore communication methods and per-chip power management, and achieved 1.01 TFLOPS at 3.16 GHz while consuming 62 W of power.^[9]^[10]
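
The headline throughput figure can be roughly reproduced from the numbers above. The following Python sketch is a back-of-the-envelope check, not Intel's published methodology. It assumes that each floating point unit completes one multiply-accumulate per clock cycle, counted as 2 floating point operations; that assumption is not stated in this article, and the function names are illustrative only.

```python
# Back-of-the-envelope check of the Teraflops Research Chip figures.
# ASSUMPTION (not stated in the article): each floating point unit retires
# one multiply-accumulate per cycle, counted as 2 floating point operations.


def peak_gflops(cores, fpus_per_core, flops_per_fpu_per_cycle, clock_ghz):
    """Theoretical peak in GFLOPS: units x operations per cycle x clock in GHz."""
    return cores * fpus_per_core * flops_per_fpu_per_cycle * clock_ghz


def gflops_per_watt(gflops, watts):
    """Energy efficiency in GFLOPS per watt."""
    return gflops / watts


if __name__ == "__main__":
    # 80 cores, 2 FPUs per core, 3.16 GHz and 62 W are taken from the text.
    peak = peak_gflops(cores=80, fpus_per_core=2,
                       flops_per_fpu_per_cycle=2, clock_ghz=3.16)
    print(f"estimated peak: {peak / 1000:.2f} TFLOPS")
    print(f"efficiency:     {gflops_per_watt(peak, 62):.1f} GFLOPS/W")
```

Under this assumption the estimate comes to about 1.01 TFLOPS, in line with the reported figure, and to roughly 16 GFLOPS per watt at 62 W.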

Knights Ferry

Intel's MIC prototype board, named Knights Ferry and incorporating a processor codenamed Aubrey Isle, was announced on 31 May 2010. The product was stated to be a derivative of the Larrabee project and other Intel research including the Single-chip Cloud Computer.^[11]

The development product was offered as a PCIe card with 32 in-order cores running at up to 1.2 GHz with 4 threads per core, 2 GB of GDDR5 memory,^[12] 8 MB of coherent L2 cache (256 kB per core, plus 32 kB of L1 cache per core),^[13] and a power requirement of ~300 W,^[12] built on a 45 nm process.^[14] In the Aubrey Isle core a 1,024-bit ring bus (512-bit bi-directional) connects the processors to main memory.^[15] Single board performance has exceeded 750 GFLOPS.^[14] The prototype boards support only single precision floating point instructions.^[16]
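
The cache and throughput figures quoted for Knights Ferry can be cross-checked with simple arithmetic. The Python sketch below is illustrative only. The total L2 size and the number of single precision lanes follow directly from the numbers in the text, whereas the peak estimate additionally assumes one fused multiply-add per lane per cycle (2 operations), which this article does not state. All names are hypothetical.

```python
# Consistency checks on the Knights Ferry figures quoted in the text.

CORES = 32
CLOCK_GHZ = 1.2            # "up to 1.2 GHz"
L2_PER_CORE_KB = 256
SIMD_WIDTH_BITS = 512
SINGLE_PRECISION_BITS = 32


def total_l2_mb(cores, l2_per_core_kb):
    """Aggregate L2 cache in MB when every core owns an equal partition."""
    return cores * l2_per_core_kb / 1024


def simd_lanes(simd_bits, element_bits):
    """Number of elements that fit in one vector register."""
    return simd_bits // element_bits


def peak_sp_gflops(cores, lanes, clock_ghz, flops_per_lane_per_cycle=2):
    """Estimated single precision peak in GFLOPS.

    ASSUMPTION: flops_per_lane_per_cycle=2 models one fused multiply-add
    per lane per cycle; this is not stated in the article.
    """
    return cores * lanes * flops_per_lane_per_cycle * clock_ghz


if __name__ == "__main__":
    lanes = simd_lanes(SIMD_WIDTH_BITS, SINGLE_PRECISION_BITS)
    peak = peak_sp_gflops(CORES, lanes, CLOCK_GHZ)
    print(f"total L2 cache:  {total_l2_mb(CORES, L2_PER_CORE_KB):.0f} MB")
    print(f"lanes per core:  {lanes}")
    print(f"estimated peak:  {peak:.1f} GFLOPS")
    print(f"750 GFLOPS is {750 / peak:.0%} of that estimate")
```

The first result confirms that 32 partitions of 256 kB make up the stated 8 MB. Under the multiply-add assumption the estimated peak is about 1.2 TFLOPS, so the reported 750 GFLOPS would correspond to roughly 60% of peak.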

Knights Corner

The Knights Corner product is expected to be manufactured on a 22 nm process using Intel's Tri-gate technology, to have more than 50 cores per chip, and to lead to commercial products.^[11]^[14]

In June 2011, SGI announced a partnership with Intel to use the MIC architecture in its high performance computing products.^[17] In September 2011, it was announced that the Texas Advanced Computing Center (TACC) will use Knights Corner cards in its 10 PetaFLOPS "Stampede" supercomputer, providing 8 PetaFLOPS of the compute power.^[2] According to "Stampede: A Comprehensive Petascale Computing Environment", the "second generation Intel (Knights Landing) MICs will be added when they become available, increasing Stampede's aggregate peak performance to at least 15 PetaFLOPS."^[18]

On November 15, 2011, Intel publicly showed a Knights Corner processor for the first time, using an early silicon sample. The demonstration showed sustained performance of more than 1 TeraFLOPS on a wide range of DGEMM operations, which Intel described as a world record for a general purpose processor. Intel emphasized that the figure represented sustained TeraFLOPS rather than the "raw TeraFLOPS" that it said others use to obtain higher but less meaningful numbers, and that this was the first general purpose co-processor to achieve TeraFLOPS performance.^[19]^[20]

Knights Landing

Knights Landing is the code name for the second generation of MIC architecture processors from Intel.^[21]^[22]

Competition

Intel MIC is designed to compete directly with Nvidia's Tesla product line in the HPC market.^[23] In the four years since their launch, Tesla co-processors have seen growing adoption in the HPC community: in the November 2011 Top500 list the number of Tesla GPU powered systems grew to 35, more than double the figure six months earlier.^[24] For two years running, Tesla GPUs have powered the "greenest" petaflop system in the Green500 list.^[25] In 2011, Chinese researchers reported using Tesla GPUs to run the world's largest molecular dynamics simulation, of 110 billion atoms at 1.87 petaflops, and to simulate the world's first complete H1N1 virus model.^[26]^[27]

Design

The basis of the Intel MIC design is to leverage the x86 legacy by creating an x86-compatible multiprocessor architecture that can use existing parallelisation software tools.^[14] Programming tools include OpenMP, OpenCL,^[28] Intel Cilk Plus and specialised versions of Intel's Fortran, C++ and math libraries.^[29]

Design elements inherited from the Larrabee project include the x86 ISA, 512-bit SIMD units, a coherent L2 cache, and an ultra-wide ring bus connecting processors and memory.
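
To give a sense of what a 512-bit SIMD unit means in practice, the following Python sketch emulates how an element-wise addition over an array would be split into register-sized chunks. It is plain Python rather than vector hardware, it says nothing about the actual MIC instruction set, and all names in it are hypothetical. It only illustrates how a 512-bit register divides into lanes and how a loop breaks down into full vectors plus a partially filled final one.

```python
# Illustrative emulation of chunking work into 512-bit vector operations.
# This runs as ordinary scalar Python; it is not a model of the real ISA.

REGISTER_BITS = 512


def lanes(element_bits):
    """Number of elements of the given width held by one 512-bit register."""
    return REGISTER_BITS // element_bits


def vector_add(a, b, element_bits=32):
    """Add two equal-length lists chunk by chunk.

    Returns the result list and the number of vector-sized chunks processed,
    where the last chunk may be only partially filled.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    width = lanes(element_bits)
    result = []
    chunks = 0
    for start in range(0, len(a), width):
        chunk_a = a[start:start + width]
        chunk_b = b[start:start + width]
        # One conceptual vector operation: every lane is added together.
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        chunks += 1
    return result, chunks


if __name__ == "__main__":
    print("32-bit lanes:", lanes(32))
    print("64-bit lanes:", lanes(64))
    total, chunks = vector_add(list(range(40)), [1] * 40)
    print("chunks needed for 40 single precision elements:", chunks)
```

With 32-bit elements a register holds 16 values, so 40 elements need two full chunks and one partial chunk, compared with 40 separate scalar additions.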

References

^ Tom R. Halfhill (18 July 2011), "Intel Shows MIC Progress", www.linleygroup.com (The Linley Group), http://www.linleygroup.com/newsletters/newsletter_detail.php?num=4729
^ ^a ^b ""Stampede's" Comprehensive Capabilities to Bolster U.S. Open Science Computational Resources", www.tacc.utexas.edu (Texas Advanced Computing Center), 22 September 2011, http://www.tacc.utexas.edu/news/press-releases/2011/stampede
^ Charlie Demerjian (3 July 2006), "New from Intel: It's Mini-Cores!", www.theinquirer.net (The Inquirer), http://www.theinquirer.net/inquirer/news/1029138/new-from-intel-its-mini-cores
^ Sources:
- Larry Seiler; Doug Carmean; Eric Sprangle; Tom Forsyth; Michael Abrash; Pradeep Dubey; Stephen Junkins; Adam Lake et al. (2008), "Larrabee: A Many-Core x86 Architecture for Visual Computing", ACM Trans. Graph. 27 (3), doi:10.1145/1360612.1360617, http://www.eecs.harvard.edu/~dbrooks/cs246/larrabee_manycore.pdf
- Tom Forsyth, "SIMD Programming with Larrabee", www.stanford.edu (Intel), http://www.stanford.edu/class/ee380/Abstracts/100106-slides.pdf
^ Ryan Smith (25 May 2010), "Intel Kills Larrabee GPU, Will Not Bring a Discrete Graphics Product to Market", www.anandtech.com (AnandTech), http://www.anandtech.com/show/3738/intel-kills-larrabee-gpu-will-not-bring-a-discrete-graphics-product-to-market
^ Tony Bradley (3 December 2009), "Intel 48-Core "Single-Chip Cloud Computer" Improves Power Efficiency", www.pcworld.com (PCWorld), http://www.pcworld.com/businesscenter/article/183653/intel_48core_singlechip_cloud_computer_improves_power_efficiency.html
^ "Intel Research : Single-Chip Cloud Computer", techresearch.intel.com (Intel), http://techresearch.intel.com/ProjectDetails.aspx?Id=1
^ Ben Ames (11 February 2007), "Intel Tests Chip Design With 80-Core Processor", www.pcworld.com (IDG News), http://www.pcworld.com/article/128924/intel_tests_chip_design_with_80core_processor.html
^ "Intel’s Teraflops Research Chip", download.intel.com (Intel), http://download.intel.com/pressroom/kits/Teraflops/Teraflops_Research_Chip_Overview.pdf
^ Anton Shilov (12 February 2007), "Intel Details 80-Core Teraflops Research Chip", www.xbitlabs.com (Xbit laboratories), http://www.xbitlabs.com/news/cpu/display/20070212224710.html
^ ^a ^b Sources:
- Rupert Goodwins (1 June 2010), "Intel unveils many-core Knights platform for HPC", www.zdnet.co.uk (ZDNet), http://www.zdnet.co.uk/news/desktop-hardware/2010/06/01/intel-unveils-many-core-knights-platform-for-hpc-40089093/
- "Intel News Release : Intel Unveils New Product Plans for High-Performance Computing", www.intel.com (Intel), 31 May 2010, http://www.intel.com/pressroom/archive/releases/2010/20100531comp.htm
^ ^a ^b Mike Giles (24 June 2010), "Runners and riders in GPU steeplechase", people.maths.ox.ac.uk: pp. 8–10, http://people.maths.ox.ac.uk/gilesm/talks/nag_tpc10.pdf
^ "Fast Sort on CPUs, GPUs and Intel MIC Architectures", techresearch.intel.com (Intel), http://techresearch.intel.com/spaw2/uploads/files/FASTsort_CPUsGPUs_IntelMICarchitectures.pdf, "Section 2.2 Radix sort on MIC Architecture: …The MIC architecture is an x86-based many-core processor architecture based on small in-order cores that uniquely combines full programmability of today’s general-purpose CPU architectures with compute-throughput and memory bandwidth capabilities of modern GPU architectures. Each core is a general-purpose processor, which has a scalar unit based on the Pentium processor design, as well as a vector unit that supports 16 32-bit float or integer operations per clock. The MIC architecture has two levels of cache: low latency L1 cache and larger globally coherent L2 cache that is partitioned among the cores. Knights Ferry (KNF) (an implementation of the MIC architecture), has a 32 kB L1 cache and 256 kB partitioned L2 cache. To further hide latency, each core is augmented with 4-way multithreading."
^ ^a ^b ^c ^d Gareth Halfacree (20 June 2011), "Intel pushes for HPC space with Knights Corner", www.thinq.co.uk (Net Communities Limited, UK), http://www.thinq.co.uk/2011/6/20/intel-pushes-hpc-space-knights-corner/
^ "Intel Many Integrated Core Architecture", www.many-core.group.cam.ac.uk (Intel), December 2010, http://www.many-core.group.cam.ac.uk/ukgpucc2/talks/Elgar.pdf
^ Rick Merritt (20 June 2011), "OEMs show systems with Intel MIC chips", www.eetimes.com (EE Times), http://www.eetimes.com/electronics-news/4217092/OEMs-show-systems-with-Intel-MIC-chips
^ Andrea Petrou (20 Jun 2011), "SGI wants Intel for super supercomputer", news.techeye.net, http://news.techeye.net/hardware/sgi-wants-intel-for-super-supercomputer
^ "knights landing". https://www.ieeecluster.org/images/program/Stampede_Abstracts.pdf. Retrieved November 16, 2011.
^ "Intel's Knights Corner: 50+ Core 22nm Co-processor". http://www.tomshardware.com/news/intel-knights-corner-mic-co-processor,14002.html. Retrieved November 16, 2011.
^ "Intel unveils 1 TFLOP/s Knights Corner". http://www.eetimes.com/electronics-news/4230654/Intel-unveils-1-TFLOP-s-Knight-s-Corner. Retrieved November 16, 2011.
^ "Develop tools to test the correct functionality of Knights Landing(Larrabee 4).". http://www.linkedin.com/in/briandhayes. Retrieved November 16, 2011.
^ "Stampede: A Comprehensive Petascale Computing Environment". https://www.ieeecluster.org/images/program/Stampede_Abstracts.pdf. Retrieved November 16, 2011.
^ "Intel takes wraps off 50-core supercomputing processor plans". http://arstechnica.com/business/news/2011/06/intel-takes-wraps-off-of-50-core-supercomputing-coprocessor-plans.ars. Retrieved June 20, 2011.
^ "Japan's K Computer Tops 10 Petaflops/s to Stay Atop Top500 List". http://www.top500.org/lists/2011/11/press-release. Retrieved November 11, 2011.
^ "NVIDIA's Tesla GPU powers Tsubame 2.0 to green supercomputer supremacy". http://www.engadget.com/2011/11/23/nvidias-tesla-gpu-powers-tsubame-2-0-to-green-supercomputer-sup/. Retrieved November 23, 2011.
^ "GPU Supercomputing Accelerates China's Solar Energy Research". http://www.hpcwire.com/hpcwire/2011-06-09/gpu_supercomputing_accelerates_chinas_solar_energy_research.html. Retrieved June 09, 2011.
^ "Chinese Super Powers First Simulation of Complete H1N1 Virus using GPUs". http://insidehpc.com/2011/11/10/chinese-super-powers-first-simulation-of-complete-h1n1-virus-using-gpus/. Retrieved November 11, 2011.
^ Rick Merritt (20 June 2011), "OEMs show systems with Intel MIC chips", www.eetimes.com (EE Times), http://www.eetimes.com/electronics-news/4217092/OEMs-show-systems-with-Intel-MIC-chips
^ "News Fact Sheet: Intel Many Integrated Core (Intel MIC) Architecture ISC'11 Demos and Performance Description", newsroom.intel.com (Intel), 20 June 2011, http://newsroom.intel.com/servlet/JiveServlet/download/2152-4-5220/ISC_Intel_MIC_factsheet.pdf